27 research outputs found

    Information theoretic perspectives on en- and decoding in audition and vision

    In cognitive neuroscience, encoding and decoding models mathematically relate stimuli in the outside world to neuronal or behavioural responses. While both stimuli and responses can be multidimensional variables, these models are on their own limited to bivariate descriptions of correspondences. In order to assess the cognitive or neuroscientific significance of such correspondences, a key challenge is to set them in relation to other variables. This thesis uses information theory to contextualise encoding and decoding models in example cases of audition and vision. In the first example, encoding models based on a certain operationalisation of the stimulus are relativised by models based on other operationalisations of the same stimulus material that are conceptually simpler and shown to predict the same neuronal response variance. This highlights the ambiguity inherent in an individual model. In the second example, a methodological contribution is made to the problem of relating the bivariate dependency of stimuli and responses to the history of response components with high degrees of predictability. This perspective demonstrates that only a subset of all stimulus-correlated response variance can be expected to be genuinely caused by the stimulus, while another subset is the consequence of the response’s own dynamics. In the third and final example, complex models are used to predict behavioural responses. Their predictions are grounded in experimentally controlled stimulus variance, which makes it easier to interpret what the models based their predictions on. Together, these three perspectives underscore the need to go beyond bivariate descriptions of correspondences in order to understand the process of perception.

    Simple acoustic features can explain phoneme-based predictions of cortical responses to speech

    When we listen to speech, we have to make sense of a waveform of sound pressure. Hierarchical models of speech perception assume that, to extract semantic meaning, the signal is transformed into unknown, intermediate neuronal representations. Traditionally, studies of such intermediate representations are guided by linguistically defined concepts, such as phonemes. Here, we argue that in order to arrive at an unbiased understanding of the neuronal responses to speech, we should focus instead on representations obtained directly from the stimulus. We illustrate our view with a data-driven, information theoretic analysis of a dataset of 24 young, healthy humans who listened to a 1 h narrative while their magnetoencephalogram (MEG) was recorded. We find that two recent results, the improved performance of an encoding model in which annotated linguistic and acoustic features were combined and the decoding of phoneme subgroups from phoneme-locked responses, can be explained by an encoding model that is based entirely on acoustic features. These acoustic features capitalize on acoustic edges and outperform Gabor-filtered spectrograms, which can explicitly describe the spectrotemporal characteristics of individual phonemes. By replicating our results in publicly available electroencephalography (EEG) data, we conclude that models of brain responses based on linguistic features can serve as excellent benchmarks. However, we believe that in order to further our understanding of human cortical responses to speech, we should also explore low-level and parsimonious explanations for apparent high-level phenomena.

    Stimulus-driven brain rhythms within the alpha band: The attentional-modulation conundrum

    Two largely independent research lines use rhythmic sensory stimulation to study visual processing. Despite the use of strikingly similar experimental paradigms, they differ crucially in their notion of the stimulus-driven periodic brain responses: One regards them mostly as synchronised (entrained) intrinsic brain rhythms; the other assumes they are predominantly evoked responses (classically termed steady-state responses, or SSRs) that add to the ongoing brain activity. This conceptual difference can produce contradictory predictions about, and interpretations of, experimental outcomes. The effect of spatial attention on brain rhythms in the alpha band (8–13 Hz) is one such instance: alpha-range SSRs have typically been found to increase in power when participants focus their spatial attention on laterally presented stimuli, in line with a gain control of the visual evoked response. In nearly identical experiments, retinotopic decreases in entrained alpha-band power have been reported, in line with the inhibitory function of intrinsic alpha. Here we reconcile these contradictory findings by showing that they result from a small but far-reaching difference between two common approaches to EEG spectral decomposition. In a new analysis of previously published human EEG data, recorded during bilateral rhythmic visual stimulation, we find the typical SSR gain effect when emphasising stimulus-locked neural activity and the typical retinotopic alpha suppression when focusing on ongoing rhythms. These opposite but parallel effects suggest that spatial attention may bias the neural processing of dynamic visual stimulation via two complementary neural mechanisms.

    Degrees of algorithmic equivalence between the brain and its DNN models

    Deep neural networks (DNNs) have become powerful and increasingly ubiquitous tools to model human cognition, and often produce similar behaviors. For example, with their hierarchical, brain-inspired organization of computations, DNNs apparently categorize real-world images in the same way as humans do. Does this imply that their categorization algorithms are also similar? We have framed the question with three embedded degrees that progressively constrain algorithmic similarity evaluations: equivalence of (i) behavioral/brain responses, which is current practice, (ii) the stimulus features that are processed to produce these outcomes, which is more constraining, and (iii) the algorithms that process these shared features, the ultimate goal. To improve DNNs as models of cognition, we develop for each degree an increasingly constrained benchmark that specifies the epistemological conditions for the considered equivalence.

    Comparison of undirected frequency-domain connectivity measures for cerebro-peripheral analysis

    Analyses of cerebro-peripheral connectivity aim to quantify ongoing coupling between brain activity (measured by MEG/EEG) and peripheral signals such as muscle activity, continuous speech, or physiological rhythms (such as pupil dilation or respiration). Due to the distinct rhythmicity of these signals, undirected connectivity is typically assessed in the frequency domain. This leaves the investigator with two critical choices, namely a) the appropriate method for spectral estimation (i.e., the transformation into the frequency domain) and b) the actual connectivity measure. As there is no consensus regarding best practice, a wide variety of methods has been applied. Here we systematically compare combinations of six standard spectral estimation methods (comprising fast Fourier and continuous wavelet transformation, bandpass filtering, and short-time Fourier transformation) and six connectivity measures (phase-locking value, Gaussian-Copula mutual information, Rayleigh test, weighted pairwise phase consistency, magnitude squared coherence, and entropy). We provide performance measures of each combination for simulated data (with precise control over true connectivity), a single-subject set of real MEG data, and a full group analysis of real MEG data. Our results show that, overall, weighted pairwise phase consistency (WPPC) and Gaussian-Copula mutual information (GCMI) tend to outperform other connectivity measures, while entropy was the only measure sensitive to bimodal deviations from a uniform phase distribution. For group analysis, choosing the appropriate spectral estimation method appears to be more critical than the connectivity measure. We discuss practical implications (sampling rate, SNR, computation time, and data length) and aim to provide recommendations tailored to particular research questions.
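
    As a rough illustration of one connectivity measure named in the abstract above, the phase-locking value can be sketched as the length of the mean resultant vector of the phase differences between two signals. This is a minimal sketch and not the authors' implementation: the function name, the 250 Hz sampling rate, the 10 Hz rhythm, and the simulated phases are all assumptions made for the example, and in practice the phases would come from one of the spectral estimation steps listed in the abstract.

```python
import numpy as np


def phase_locking_value(phase_a, phase_b):
    """Length of the mean resultant vector of the phase differences (0 to 1)."""
    diff = np.asarray(phase_a) - np.asarray(phase_b)
    return float(np.abs(np.mean(np.exp(1j * diff))))


# Simulated phases (all values below are hypothetical, for illustration only).
rng = np.random.default_rng(0)
n_samples = 5000
t = np.arange(n_samples) / 250.0            # assumed 250 Hz sampling rate
brain_phase = 2 * np.pi * 10.0 * t          # idealised 10 Hz rhythm

# Peripheral signal with a fixed phase lag and a little jitter: strong locking.
locked_phase = brain_phase + 0.8 + 0.1 * rng.standard_normal(n_samples)
# Peripheral signal with phases drawn uniformly at random: no locking.
unlocked_phase = rng.uniform(-np.pi, np.pi, n_samples)

plv_locked = phase_locking_value(brain_phase, locked_phase)
plv_unlocked = phase_locking_value(brain_phase, unlocked_phase)
print(f"locked: {plv_locked:.3f}, unlocked: {plv_unlocked:.3f}")
```

    A value near 1 indicates a consistent phase lag between the two signals, while a value near 0 indicates phase differences spread evenly around the circle.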

    Grounding deep neural network predictions of human categorization behavior in understandable functional features: the case of face identity

    Deep neural networks (DNNs) can resolve real-world categorization tasks with apparent human-level performance. However, true equivalence of behavioral performance between humans and their DNN models requires that their internal mechanisms process equivalent features of the stimulus. To develop such feature equivalence, our methodology leveraged an interpretable and experimentally controlled generative model of the stimuli (realistic three-dimensional textured faces). Humans rated the similarity of randomly generated faces to four familiar identities. We predicted these similarity ratings from the activations of five DNNs trained with different optimization objectives. Using information theoretic redundancy, reverse correlation, and the testing of generalization gradients, we show that DNN predictions of human behavior improve because their shape and texture features overlap with those that subsume human behavior. Thus, we must equate the functional features that subsume the behavioral performances of the brain and its models before comparing where, when, and how these features are processed.

    Modeling individual preferences reveals that face beauty is not universally perceived across cultures

    Facial attractiveness confers considerable advantages in social interactions [1, 2], with preferences likely reflecting psychobiological mechanisms shaped by natural selection. Theories of universal beauty propose that attractive faces comprise features that are closer to the population average [3] while optimizing sexual dimorphism [4]. However, emerging evidence questions this model as an accurate representation of facial attractiveness [5–7], including representing the diversity of beauty preferences within and across cultures [8–12]. Here, we demonstrate that Western Europeans (WEs) and East Asians (EAs) evaluate facial beauty using culture-specific features, contradicting theories of universality. With a data-driven method, we modeled, at both the individual and group levels, the attractive face features of young females (25 years old) in two matched groups each of 40 young male WE and EA participants. Specifically, we generated a broad range of same- and other-ethnicity female faces with naturally varying shapes and complexions. Participants rated each on attractiveness. We then reverse correlated the face features that drive perception of attractiveness in each participant. From these individual face models, we reconstructed a facial attractiveness representation space that explains preference variations. We show that facial attractiveness is distinct both from averageness and from sexual dimorphism in both cultures. Finally, we disentangled attractive face features into those shared across cultures, culture specific, and specific to individual participants, thereby revealing their diversity. Our results have direct theoretical and methodological impact for representing diversity in social perception and for the design of culturally and ethnically sensitive socially interactive digital agents.

    Sources of Carbon Monoxide and Formaldehyde in North America Determined from High-Resolution Atmospheric Data

    We analyze the North American budget for carbon monoxide using data for CO and formaldehyde concentrations from tall towers and aircraft in a model-data assimilation framework. The Stochastic Time-Inverted Lagrangian Transport model for CO (STILT-CO) determines local to regional-scale CO contributions associated with production from fossil fuel combustion, biomass burning, and oxidation of volatile organic compounds (VOCs) using an ensemble of Lagrangian particles driven by high resolution assimilated meteorology. In many cases, the model demonstrates high fidelity simulations of hourly surface data from tall towers and point measurements from aircraft, with somewhat less satisfactory performance in coastal regions and when CO from large biomass fires in Alaska and the Yukon Territory influences the continental US. Inversions of STILT-CO simulations for CO and formaldehyde show that current inventories of CO emissions from fossil fuel combustion are significantly too high, by almost a factor of three in summer and a factor of two in early spring, consistent with recent analyses of data from the INTEX-A aircraft program. Formaldehyde data help to show that sources of CO from oxidation of CH4 and other VOCs represent the dominant sources of CO over North America in summer.
    Earth and Planetary Science

    Stimulus models test hypotheses in brains and DNNs

    No abstract available